We study the predictability of emergent phenomena in complex systems. Using nearest neighbor,
one-dimensional Cellular Automata (CA) as an example, we show how to construct local coarse- 
grained descriptions of CA in all classes of Wolfram's classification. The resulting coarse-grained CA 
that we construct are capable of emulating the large-scale behavior of the original systems without 
accounting for small-scale details. Several CA that can be coarse-grained by this construction are 
known to be universal Turing machines; they can emulate any CA or other computing devices and 
are therefore undecidable. We thus show that because in practice one only seeks coarse-grained 
information, complex physical systems can be predictable and even decidable at some level of de- 
scription. The renormalization group flows that we construct induce a hierarchy of CA rules. This 
hierarchy agrees well with apparent rule complexity and is therefore a good candidate for a com- 
plexity measure and a classification method. Finally we argue that the large scale dynamics of 
CA can be very simple, at least when measured by the Kolmogorov complexity of the large scale 
update rule, and moreover exhibits a novel scaling law. We show that because of this large-scale 
simplicity, the probability of finding a coarse-grained description of CA approaches unity as one 
goes to increasingly coarser scales. We interpret this large scale simplicity as a pattern formation 
mechanism in which large scale patterns are forced upon the system by the simplicity of the rules 
that govern the large scale dynamics. 
I. INTRODUCTION

The scope of the growing field of "complexity science" 
(or "complex systems" ) includes a broad variety of prob- 
lems belonging to different scientific areas. Examples 
for "complex systems" can be found in physics, biology, 
computer science, ecology, economy, sociology and other 
fields. A recurring theme in most of what is classified 
as "complex systems" is that of emergence. Emergent 
properties are those which arise spontaneously from the 
collective dynamics of a large assemblage of interacting 
parts. A basic question one asks in this context is how 
to derive and predict the emergent properties from the 
behavior of the individual parts. In other words, the cen- 
tral issue is how to extract large-scale, global properties 
from the underlying or microscopic degrees of freedom. 

In the physical sciences, there are many examples of 
emergent phenomena where it is indeed possible to re- 
late the microscopic and macroscopic worlds. Physical 
systems are typically described in terms of equations of 
motion of a huge number of microscopic degrees of free- 
dom (e.g. atoms). The microscopic dynamics is often 
erratic and complex, yet in many cases it gives rise to 
patterns with characteristic length and time scales much 
larger than the microscopic ones (e.g. the pressure and
temperature of a gas). These large scale patterns often
possess the interesting, physically relevant properties of
the system and one would like to model them or sim- 
ulate their behavior. An important problem in physics 
is therefore to understand and predict the emergence of 
large scale behavior in a system, starting from its micro- 
scopic description. This problem is a fundamental one 
because most physical systems contain too many parts to 
be simulated directly and would become intractable with- 
out a large reduction in the number of degrees of freedom. 
A useful way to address this issue is to construct coarse- 
grained models, which treat the dynamics of the large 
scale patterns. The derivation of coarse-grained models 
from the microscopic dynamics is far from trivial. In 
most cases it is done in a phenomenological manner by 
introducing various (often uncontrolled) approximations. 

The problem of predicting emergent properties is most 
severe in systems which are modelled or described by 
undecidable mathematical algorithms [1, 2]. For such
systems there exists no computationally efficient way of
predicting their long time evolution. In order to know the
system's state after (e.g.) one million time steps one must 
evolve the system a million time steps or perform a com- 
putation of equivalent complexity. Wolfram has termed 
such systems computationally irreducible and suggested 
that their existence in nature is at the root of our appar- 
ent inability to model and understand complex systems 
P, 0, 0, 0] . It is tempting to conclude from this that the 
enterprise of physics itself is doomed from the outset;
rather than attempting to construct solvable mathemat- 
ical models of physical processes, computational mod- 
els should be built, explored and empirically analyzed. 
This argument, however, assumes that infinite precision 
is required for the prediction of future evolution. As we 
mentioned above, usually coarse-grained or even statis- 
tical information is sufficient. An interesting question 
that arises is therefore: is it possible to derive coarse- 
grained models of undecidable systems and can these 
coarse-grained models be decidable and predictable? 

In this work we address the emergence of large 
scale patterns in complex systems and the associated 
predictability problems by studying Cellular Automata
(CA). CA are spatially and temporally discrete dynami- 
cal systems composed of a lattice of cells. They were orig- 
inally introduced by von Neumann and Ulam in the 
1940's as a possible way of simulating self-reproduction 
in biological systems. Since then, CA have attracted a 
great deal of interest in physics H, S H 01 because they 
capture two basic ingredients of many physical systems: 
1) they evolve according to a local uniform rule. 2) CA 
can exhibit rich behavior even with very simple update 
rules. For similar and other reasons, CA have also at- 
tracted attention in computer science 0, ^J, biology 
[T^ . material science ^3 ^^'^ many other fields. For a 
review on the literature on CA see Refs. 0: IM 0- 

The simple construction of CA makes them accessi- 
ble to computational theoretic research methods. Using 
these methods it is sometimes possible to quantify the 
complexity of CA rules according to the types of com- 
putations they are capable of performing. This together 
with the fact that CA are caricatures of physical systems 
has led many authors to use them as a conceptual vehicle 
for studying complexity and pattern formation. In this 
work we adopt this approach and study the predictability 
of emergent patterns in complex systems by attempting 
to systematically coarse-grain CA. A brief preliminary 
report of our project can be found in Ref. [14].

There is no unique way to define coarse-graining, but 
here we will mean that our information about the CA is 
locally coarse-grained in the sense of being stroboscopic 
in time, but that nearby cells are grouped into a supercell 
according to some specified rule (as is frequently done in 
statistical physics). Below we shall frequently drop the 
qualifier "local" whenever there is no cause for confusion. 
A system which can be coarse-grained is compact-able 
since it is possible to calculate its future time evolution 
(or some coarse aspects of it) using a more compact al- 
gorithm than its native description. Note that our use 
of the term compact-able refers to the phase space re- 
duction associated with coarse-graining, and is agnostic 
as to whether or not the coarse-grained system is decid- 
able or undecidable. Accordingly, we define predictable to 
mean that a system is decidable or has a decidable coarse- 
graining. Thus, it is possible to calculate the future time 
evolution of a predictable system (or some coarse aspects 
of it) using an algorithm which is more compact than
both the native and coarse-grained descriptions.

Our work is organized as follows. In section II we give
an introduction to CA and their use in the study of
complexity. In section III we present a procedure for
coarse-graining CA. Section IV shows and discusses the
results of applying our procedure to one dimensional CA.
Most of the CA that we attempt to coarse-grain are Wolfram's
256 elementary rules for nearest-neighbor CA. We will
also consider a few other rules of special interest. In
section V we consider whether the coarse-grainability of
many CA that we found in the elementary rule family
is a common property of CA. Using computational
theoretic arguments we argue that the large scale behavior
of local processes must be very simple. Almost all CA
can therefore be coarse-grained if we go to a large enough
scale. Our results are summarized and discussed in
section VI.



II. CELLULAR AUTOMATA 

Cellular automata are a class of homogeneous, local
and fully discrete dynamical systems. A cellular automaton
$A = (a(t), \{S_A\}, f_A)$ is composed of a lattice $a(t)$ of
cells that can each assume a value from a finite alphabet
$\{S_A\}$. We denote individual lattice cells by $a_\alpha(t)$,
where the indexing reflects the dimensionality and
geometry of the lattice. Cell values evolve in discrete time
steps according to the pre-prescribed update rule $f_A$.
The update rule determines a cell's new state as a
function of cell values in a finite neighborhood. For example,
in the case of a one dimensional, nearest-neighbor
CA the update rule is a function $f_A : \{S_A\}^3 \to \{S_A\}$
and $a_n(t+1) = f_A[a_{n-1}(t), a_n(t), a_{n+1}(t)]$. At each
time step, each cell in the lattice applies the update
rule and updates its state accordingly. The application
of the update rule is done in parallel for all the cells
and all the cells apply the same rule. We denote the
application of the update rule on the entire lattice by
$a(t+1) = f_A \cdot a(t)$.

In early work [E 0, 0, Ell , Wolfram proposed that CA 
can be grouped into four classes of complexity. Class 1 
consists of CA whose dynamics reaches a steady state re- 
gardless of the initial conditions. Class 2 consists of CA 
whose long time evolution produces periodic or nested 
structures. CA from both of these classes are simple 
in the sense that their long time evolution can be de- 
duced from running the system a small number of time 
steps. On the other hand, class 3 and class 4 consist of 
"complex" CA. Class 3 CA produce structures that seem 
random. Class 4 CA produce localized structures that 
propagate and interact in a complex way above a regu- 
lar background. This classification is heuristic and the 
assignment of CA to the four classes is somewhat subjective.
Subsequent works on CA attempted to improve it or
to find better alternatives O El El El [13 [II [H . To 
the best of our knowledge there is, to date, no universally 
agreed upon classification scheme of CA. 

Based on numerical experiments, Wolfram hypothesized
that most class 3 and 4 CA are Computationally
Irreducible. Namely, the evolution of these
CA cannot be predicted by a process which is drastically 
more efficient than themselves. In order to calculate the 
state of a Computationally Irreducible CA after t time 
steps, one must run the CA for t time steps or perform a 
computation of equivalent complexity. This definition is 
somewhat loose because it is not always clear how to 
compare computation running times and efficiency on 
different architectures. In addition, Wolfram recognized 
that even computationally irreducible systems may have 
some "superficial reducibility" (see page 746 in Ref. 0) 
and can be reduced to a limited extent. The difference 
between "superficial" and true reducibility however is not 
well defined. It is nevertheless clear that the asymptotic 
$t \to \infty$ behavior of a Computationally Irreducible
system cannot be predicted by any computation of finite
size. Wolfram further argued that Computationally Irre- 
ducible systems are abundant in nature and that this fact 
explains our inability as physicists to deal with complex 
systems [lllllll. 

It is difficult in general to tell whether a CA, behav- 
ing in an apparently complex way, is Computationally 
Irreducible. More concrete properties of CA which are 
related to Computational Irreducibility are Undecidabil- 
ity and Universality. Mathematical processes are said 
to be undecidable when there can be no algorithm that 
is guaranteed to predict their outcome in a finite time. 
Equivalently, CA are said to be undecidable when aspects 
of their dynamics are undecidable. Computationally
Irreducible CA are therefore Undecidable and, in the weak
asymptotic definition that we gave above, Computational
Irreducibility is equivalent to Undecidability. For lack of
a better choice we adopt this asymptotic definition and
in the remainder of this work we will use the two terms
interchangeably.

Some CA are known to be universal Turing 
machines j23j and are capable of performing all computa- 
tions done by other processes. A famous two dimensional 
example is Conway's game of life|23; several examples in 
one dimension are Lindgren and Nordahl [2^ , Albert and 
Culik and Wolfram's rule 110 fl. Universal CA are, 
in a sense, maximally complex because they can emulate 
the dynamics of all other CA. Being universal Turing 
machines, these CA are subject to undecidable questions 
regarding their dynamics [15]. For example, whether an
initial state will ever decay into a quiescent state is the 
CA equivalence of the undecidable halting problem p^. 
Universal CA are therefore Undecidable. 

Wolfram's classification of CA is topological in the 
sense that CA are classified according to the properties of 
their trajectories. A different, more ambitious, approach 
is to classify CA according to a parameter derived
directly from their rule tables. Langton suggested that
CA rules can be parameterized by his $\lambda$ parameter, which
measures the fraction of non-quiescent rule table
entries. He showed a strong correlation between the value
of $\lambda$ and the complexity found in the CA trajectories. For
small values of $\lambda$ one characteristically finds class 1 and
2 behavior while for $\lambda \sim 1$ a class 3 behavior is usually
observed. Langton identified a narrow region of
intermediate values of $\lambda$ where he found class 4 characteristic
behavior. Based on these observations Langton proposed 
the edge of chaos hypothesis [2^. This hypothesis claims 
that in the space of dynamical systems, interesting sys- 
tems which are capable of computation are located at the 
boundary between simple and chaotic systems. This ap- 
pealing hypothesis however was criticized in later works 
[28]. Recently, a different parametrization of CA rule
tables was proposed by Dubacq et al. (2^. This new 
approach is based on the information content of the rule 
table as measured by its Kolmogorov Complexity. As 
we will show below, our results lend support to this no- 
tion and indicate that rule tables with low Kolmogorov 
complexity lead to simple behavior and vice versa. 

In addition to attempts to find order and hierarchy in 
the space of CA rules, much research has been devoted to 
the study of CA classes with special properties. Additive 
C A (or linear) [33, |^ , commuting C A |3^ and CA with 
certain algebraic properties [33, 34] are a few examples.
Unsurprisingly, the dynamics of CA which enjoy such 
special properties can in most cases be understood and 
predictable at some level. 

In this work we will mostly be concerned with the fam- 
ily of one dimensional, nearest neighbor binary CA that 
were the subject of Wolfram's investigations. These 256 
elementary rules are among the simplest imaginable CA 
and thus present us with the least computational chal- 
lenges when attempting to coarse-grain them. We will 
use Wolfram's notation[3 for identifying individual rules. 
The update function of an elementary rule is described by 
a rule number between 0 and 255. The eight bit binary
representation of the rule number specifies the update 
function outcome for the eight possible three cell config- 
urations (where "000" is the least significant and "111" is 
the most significant bit). CA are often conveniently visu- 
alized with different colors denoting different cell values. 
When dealing with binary CA we will use the convention 
□ = 0, ■ = 1 and use the two notations interchangeably. 
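
The rule-number convention just described can be made concrete with a short sketch. The function names `rule_table` and `step`, and the use of a periodic lattice, are illustrative assumptions and are not part of the original text.

```python
def rule_table(rule_number):
    """Decode an elementary rule number into its update function.

    The neighbourhood (left, centre, right), read as a 3-bit binary number,
    selects which bit of the rule number is the output; "000" selects the
    least significant bit and "111" the most significant one.
    """
    if not 0 <= rule_number <= 255:
        raise ValueError("elementary rule numbers lie between 0 and 255")
    table = {}
    for index in range(8):
        neighbourhood = ((index >> 2) & 1, (index >> 1) & 1, index & 1)
        table[neighbourhood] = (rule_number >> index) & 1
    return table


def step(cells, table):
    """Apply the rule once, in parallel, on a periodic lattice (an assumption)."""
    n = len(cells)
    return [table[(cells[i - 1], cells[i], cells[(i + 1) % n])]
            for i in range(n)]


table_128 = rule_table(128)
print(table_128[(1, 1, 1)], sum(table_128.values()))  # only "111" maps to 1
print(step([1, 1, 1, 0], table_128))
```

For rule 128 only the configuration "111" produces a black cell, which is why black regions shrink from their edges.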



III. LOCAL COARSE-GRAINING OF 
CELLULAR AUTOMATA 

We now turn to study the emergence of large scale 
patterns in CA and the associated predictability prob- 
lems by attempting to coarse-grain CA. There are many 
ways to define a coarse-graining of a dynamical system.
In this work we define it as a (real-space)
renormalization scheme where the original CA
$A = (a(t), \{S_A\}, f_A)$ is coarse-grained to a renormalized CA
$B = (b(t), \{S_B\}, f_B)$ through the lattice transformation
$b_k = P(a_{N \cdot k}, a_{N \cdot k+1}, \ldots, a_{N \cdot k+N-1})$. The
projection function $P : \{S_A\}^N \to \{S_B\}$ projects the value of
a block of N cells in A, which we term a supercell, to a
single cell in B. By writing $P \cdot a$ we denote the block-wise
application of P on the entire lattice a. Only non-trivial
cases where P is irreversible are considered because we
want B to provide a partial account of the full dynamics
of A.

In order for B and P to provide a coarse-grained
emulation of A they must satisfy the commutativity condition

$P \cdot f_A^T \cdot a(0) = f_B \cdot P \cdot a(0)$ , (1)

for every initial condition a(0) of A. The constant T in
the above equation is a time scale associated with the
coarse-graining. A repeated application of Eq. (1) shows
that

$P \cdot f_A^{T \cdot t} \cdot a(0) = f_B^t \cdot P \cdot a(0)$ , (2)

for all t. Namely, running the original CA for $T \cdot t$ time
steps and then projecting is equivalent to projecting the
initial condition and then running the renormalized CA
for t time steps. Thus, if we are only interested in the
projected information we can run the more efficient CA
B.
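
The commutativity condition of Eq. (2) can be checked numerically for a candidate pair of rules. The sketch below assumes elementary (binary, nearest-neighbor) rules on a periodic lattice and a time scale T equal to the supercell size N; the helper names are hypothetical. It tests rule 128 coarse-grained to itself with N = 2 and a projection that maps a supercell to black only when both of its cells are black.

```python
import random


def rule_table(rule_number):
    """Update function of an elementary CA, keyed by (left, centre, right)."""
    return {((i >> 2) & 1, (i >> 1) & 1, i & 1): (rule_number >> i) & 1
            for i in range(8)}


def step(cells, table):
    """One parallel update on a periodic lattice."""
    n = len(cells)
    return [table[(cells[i - 1], cells[i], cells[(i + 1) % n])]
            for i in range(n)]


def project(cells, N, P):
    """Block-wise application of the projection P to supercells of size N."""
    return [P[tuple(cells[i:i + N])] for i in range(0, len(cells), N)]


def commutes(rule_a, rule_b, N, P, t, cells):
    """Check Eq. (2) with T = N for one initial condition."""
    table_a, table_b = rule_table(rule_a), rule_table(rule_b)
    fine, coarse = list(cells), project(cells, N, P)
    for _ in range(N * t):          # run A for T * t steps
        fine = step(fine, table_a)
    for _ in range(t):              # run B for t steps
        coarse = step(coarse, table_b)
    return project(fine, N, P) == coarse


P_and = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
random.seed(0)
lattice = [random.randint(0, 1) for _ in range(32)]
print(commutes(128, 128, 2, P_and, 3, lattice))
```

A check of this kind only samples initial conditions; it can refute a proposed coarse-graining but cannot by itself prove that Eq. (1) holds for every a(0).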

Renormalization group transformations in statistical 
physics are usually performed with projection operators 
that arise from a physical intuition and understanding 
of the system in question. Majority rules and different 
types of averages are often the projection operators of 
choice. In this work we have the advantage that the CA
we wish to coarse-grain are fully discrete systems and
the number of possible projections of a supercell of size
N is finite. We will therefore consider all possible (at
least with small supercells) projection operators and will
not restrict ourselves to coarse-graining by averaging. In
addition, the discrete nature of CA makes it very difficult
to find useful approximate solutions of Eq. (1) because
there is no natural small parameter that can be used
to construct perturbative coarse-graining schemes. We
therefore require that Eq. (1) is satisfied exactly.



A. Coarse-graining procedure 

We now define a simple procedure for coarse-graining 
CA. Other constructions are undoubtedly possible. For 
simplicity we limit our treatment to one-dimensional sys- 
tems with nearest neighbor interactions. Generalizations 
to higher dimensions and different interaction radii are 
straightforward. 

The commutativity condition Eq. (1) implies that the
renormalized CA B is homomorphic to the dynamics of A
on the scale defined by the supercell size N. To search
for explicit coarse-graining rules, we define the N'th
supercell version $A^N = (a^N, \{S_{A^N}\}, f_{A^N})$ of A. Each cell
of $A^N$ represents N cells of A and accepts values from
the alphabet $\{S_{A^N}\} = \{S_A\}^N$, which includes all possible
configurations of N cells in A. The transition function
$f_{A^N}$ of the supercell CA can be defined in many ways
depending on our choice of the supercells' interaction
radius. Here we choose $A^N$ to be a nearest neighbor CA
and compute $f_{A^N} : \{S_{A^N}\}^3 \to \{S_{A^N}\}$ by running A for
N time steps on all possible initial conditions of length
3N. In this way $A^N$ follows the dynamics of A and each
application of $A^N$ computes the evolution of a block of N
cells of A, for N time steps. This choice will later result
in a coarse-grained CA B which is itself nearest-neighbor.
This is convenient because it enables us to compare the
original and coarse-grained systems. Another convenient
feature of this construction is that it renders the
coarse-graining time scale T equal to the supercell size N. Other
constructions however are undoubtedly possible. Note
that $A^N$ is not a coarse-graining of A because no
information was lost in the cell translation.
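
The construction of the supercell transition function can be sketched in a few lines for elementary rules. The function name `supercell_table` and the representation of supercell states as tuples of 0/1 are illustrative assumptions.

```python
from itertools import product


def supercell_table(rule_number, N):
    """Tabulate the supercell update function of an elementary CA.

    Every block of 3N cells (three neighbouring supercells) is evolved for
    N time steps. Each step consumes one cell at either end of the block,
    so exactly the N cells of the middle supercell remain.
    """
    def rule(left, centre, right):
        return (rule_number >> (4 * left + 2 * centre + right)) & 1

    states = list(product((0, 1), repeat=N))
    table = {}
    for x1, x2, x3 in product(states, repeat=3):
        cells = x1 + x2 + x3
        for _ in range(N):
            cells = tuple(rule(cells[i - 1], cells[i], cells[i + 1])
                          for i in range(1, len(cells) - 1))
        table[(x1, x2, x3)] = cells
    return table


f_128_2 = supercell_table(128, 2)
print(len(f_128_2))                          # 4**3 = 64 supercell triples
print(f_128_2[((0, 1), (1, 1), (1, 1))])
```

The table has $|S_A|^{3N}$ entries, so this brute-force tabulation is only practical for small N.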

Next we attempt to generate the coarse CA B by
projecting the alphabet of $A^N$ on a subset $\{S_B\} \subset \{S_{A^N}\}$
which will serve as the alphabet of B. This is the key step
where information is being lost. The transition function
$f_B$ is constructed from $f_{A^N}$ by projecting its arguments
and outcome:

$f_B[P(x_1), P(x_2), P(x_3)] = P(f_{A^N}[x_1, x_2, x_3])$ . (3)

Here P(x) denotes the projection of the supercell value
x. This construction is possible only if

$P(f_{A^N}[x_1, x_2, x_3]) = P(f_{A^N}[y_1, y_2, y_3])$ ,
$\forall (x, y \mid P(x_i) = P(y_i))$ . (4)

Otherwise, $f_B$ is multi-valued and our coarse-graining
attempt fails for the specific choice of N and P.
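
A minimal sketch of the test expressed by Eqs. (3) and (4) follows, again for elementary rules. The names `supercell_table` and `coarse_rule` are hypothetical; the projection P is represented as a dictionary from supercell states to coarse states.

```python
from itertools import product


def supercell_table(rule_number, N):
    """Supercell update function: N steps of the rule on blocks of 3N cells."""
    def rule(left, centre, right):
        return (rule_number >> (4 * left + 2 * centre + right)) & 1

    states = list(product((0, 1), repeat=N))
    table = {}
    for x1, x2, x3 in product(states, repeat=3):
        cells = x1 + x2 + x3
        for _ in range(N):
            cells = tuple(rule(cells[i - 1], cells[i], cells[i + 1])
                          for i in range(1, len(cells) - 1))
        table[(x1, x2, x3)] = cells
    return table


def coarse_rule(f_AN, P):
    """Build f_B from Eq. (3); return None if Eq. (4) is violated."""
    f_B = {}
    for (x1, x2, x3), out in f_AN.items():
        key = (P[x1], P[x2], P[x3])
        value = P[out]
        # setdefault stores the first value seen for this projected triple;
        # any later disagreement means f_B would be multi-valued.
        if f_B.setdefault(key, value) != value:
            return None
    return f_B


f_AN = supercell_table(128, 2)
P_and = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
P_first = {(0, 0): 0, (0, 1): 0, (1, 0): 1, (1, 1): 1}
print(coarse_rule(f_AN, P_and))    # a single-valued coarse rule
print(coarse_rule(f_AN, P_first))  # None: this projection fails
```

Because the check visits every supercell triple exactly once, its cost is set by the size of the table of $f_{A^N}$, not by the number of pairs (x, y).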

Equations (3) and (4) can also be cast in the matrix
form

$P \cdot A^N = B \cdot P_3$ , (5)

which may be useful. Here $A^N$ is an $S_{A^N} \times (S_{A^N})^3$ matrix
which specifies the N cell block output for every possible
combination of 3N cells. P is an $S_B \times S_{A^N}$ matrix that
projects from $S_{A^N}$ to $S_B$. $P_3$ is a $(S_B)^3 \times (S_{A^N})^3$ matrix
which projects 3 consecutive supercells and is a (simple)
function of P. The coarse-grained CA B is an $S_B \times (S_B)^3$
matrix and is also a function of P. This is a greatly
overdetermined equation for the projection operator P.
For a given value of N and $S_B$ the equation contains
$S_B \times (S_{A^N})^3$ constraints while P is defined by $S_{A^N}$ free
parameters.
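
As a rough illustration of how overdetermined Eq. (5) is, consider an elementary CA with N = 2 projected back on a binary alphabet, reading the symbols S as alphabet sizes (an assumption about the notation):

```latex
\begin{align*}
S_{A^N} &= 2^N = 4, \qquad S_B = 2, \\
\text{constraints} &= S_B \times (S_{A^N})^3 = 2 \times 4^3 = 128, \\
\text{free parameters} &= S_{A^N} = 4 .
\end{align*}
```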

In cases where Eq. (4) is satisfied, the resulting CA B
is a coarse-graining of $A^N$ with a time scale T = 1. For
every step $a^N_n(t+1) = f_{A^N}[a^N_{n-1}(t), a^N_n(t), a^N_{n+1}(t)]$ of
$A^N$, B makes the move

$b_n(t+1) = f_B[b_{n-1}(t), b_n(t), b_{n+1}(t)]$ (6)
$= P(f_{A^N}[a^N_{n-1}(t), a^N_n(t), a^N_{n+1}(t)])$
$= P(a^N_n(t+1))$ ,

and therefore satisfies Eq. (1). Since a single time
step of $A^N$ computes N time steps of A, B is also a
coarse-graining of A with a coarse-grained time scale
T = N. Analogies of these operators have been used
in attempts to reduce the computational complexity of 
certain stochastic partial differential equations [35, 36].
Similar ideas have been used to calculate critical expo- 
nents in probabilistic CA [stI Issj l . 

To illustrate our method let us give a simple example.
Rule 128 is a class 1 elementary CA defined on the {□, ■}
alphabet with the update function

$f_{128}[x_{n-1}, x_n, x_{n+1}] = \begin{cases} □, & x_{n-1}, x_n, x_{n+1} \neq ■, ■, ■ \\ ■, & x_{n-1}, x_n, x_{n+1} = ■, ■, ■ \end{cases}$ . (7)

Figure 2 b) shows a typical evolution of this simple rule
where all black regions which are in contact with white
cells decay at a constant rate. To coarse-grain rule 128 we
choose a supercell size N = 2 and calculate the supercell
update function

$f_{128^2}[y_{n-1}, y_n, y_{n+1}] = \begin{cases} ■■, & y_{n-1}, y_n, y_{n+1} = ■■, ■■, ■■ \\ □■, & y_{n-1}, y_n, y_{n+1} = □■, ■■, ■■ \\ ■□, & y_{n-1}, y_n, y_{n+1} = ■■, ■■, ■□ \\ □□, & \text{all other combinations} \end{cases}$ . (8)

Next we project the supercell alphabet using

$P(y) = \begin{cases} □, & y \neq ■■ \\ ■, & y = ■■ \end{cases}$ . (9)

Namely, the value of the coarse-grained cell is black only
when the supercell value corresponds to two black cells.
Applying this projection to the supercell update function
Eq. (8) we find that

$f_B[P(y_{n-1}), P(y_n), P(y_{n+1})] = P(f_{128^2}[y_{n-1}, y_n, y_{n+1}]) = \begin{cases} ■, & P(y_{n-1}), P(y_n), P(y_{n+1}) = ■, ■, ■ \\ □, & P(y_{n-1}), P(y_n), P(y_{n+1}) \neq ■, ■, ■ \end{cases}$ , (10)

which is identical to the original update function $f_{128}$.
Rule 128 can therefore be coarse-grained to itself, an
expected result due to the scale invariant behavior of this
simple rule.



B. Relevant and irrelevant degrees of freedom 

It is interesting to notice that the above coarse-graining
procedure can lose two very different types of dynamic
information. To see this, consider Eq. (4). This equation
can be satisfied in two ways. In the first case

$f_{A^N}[x_1, x_2, x_3] = f_{A^N}[y_1, y_2, y_3]$ ,
$\forall (x, y \mid P(x_i) = P(y_i))$ , (11)

which necessarily leads to Eq. (4). $f_{A^N}$ in this case is
insensitive to the projection of its arguments. The
distinction between two variables which are identical under
projection is therefore irrelevant to the dynamics of $A^N$,
and by construction to the long time dynamics of A. By
eliminating irrelevant degrees of freedom (DOF),
coarse-graining of this type removes information which is
redundant on the microscopic scale. The coarse CA in this
case accounts for all possible long time trajectories of the
original CA and the complexity classification of the two
CA is therefore the same.

In the second case Eq. (4) is satisfied even though Eq.
(11) is violated. Here the distinction between two
variables which are identical under projection is relevant to
the dynamics of A. Replacing x by y in the initial
condition may give rise to a difference in the dynamics of A.
Moreover, the difference can be (and in many occasions 
is) unbounded in space and time. Coarse-graining in this 
case is possible because the difference is constrained in 
the cell state space by the projection operator. Namely, 
projection of all such different dynamics results in the 
same coarse-grained behavior. Note that the coarse CA 
in this case cannot account for all possible long time tra- 
jectories of the original one. It is therefore possible for 
the original and coarse CA to fall into different complex- 
ity classifications. 

Coarse-graining by elimination of relevant DOF re- 
moves information which is not redundant with respect to 
the original system. The information becomes redundant 
only when moving to the coarse scale. In fact, "redun- 
dant" becomes a subjective qualifier here since it depends 
on our choice of coarse description. In other words, it de- 
pends on what aspects of the microscopic dynamics we 
want the coarse CA to capture. 
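
The two cases can be told apart mechanically: a valid projection eliminates only irrelevant DOF when the stronger condition Eq. (11) also holds. The sketch below assumes elementary rules; `classify` and `supercell_table` are hypothetical helper names.

```python
from itertools import product


def supercell_table(rule_number, N):
    """Supercell update function: N steps of the rule on blocks of 3N cells."""
    def rule(left, centre, right):
        return (rule_number >> (4 * left + 2 * centre + right)) & 1

    states = list(product((0, 1), repeat=N))
    table = {}
    for x1, x2, x3 in product(states, repeat=3):
        cells = x1 + x2 + x3
        for _ in range(N):
            cells = tuple(rule(cells[i - 1], cells[i], cells[i + 1])
                          for i in range(1, len(cells) - 1))
        table[(x1, x2, x3)] = cells
    return table


def classify(f_AN, P):
    """Return 'invalid', 'irrelevant' or 'relevant' for a projection P."""
    projected_output, raw_output = {}, {}
    only_irrelevant = True
    for (x1, x2, x3), out in f_AN.items():
        key = (P[x1], P[x2], P[x3])
        if projected_output.setdefault(key, P[out]) != P[out]:
            return "invalid"              # Eq. (4) is violated
        if raw_output.setdefault(key, out) != out:
            only_irrelevant = False       # Eq. (11) is violated
    return "irrelevant" if only_irrelevant else "relevant"


P_and = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
print(classify(supercell_table(128, 2), P_and))  # relevant
print(classify(supercell_table(0, 2), P_and))    # irrelevant
```

Rule 0 is a trivial instance of the first case, since its supercell output does not depend on its arguments at all. For rule 128 the supercell outputs □■ and □□ are distinct but project to the same coarse state, so the projection removes relevant DOF in the sense used above.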

Let us illustrate the difference between coarse-graining 
of relevant and irrelevant DOF. Consider a dynamical 
system whose initial condition is in the vicinity of two 
limit cycles. Depending on the initial condition, the sys- 
tem will flow to one of the two cycles. Coarse-graining 
of irrelevant DOF can project all the initial conditions 
on to two possible long time behaviors. Now consider 
a system which is chaotic with two strange attractors. 
Coarse-graining irrelevant DOF is inappropriate because 
the dynamics is sensitive to small changes in the initial 
conditions. Coarse-graining of relevant DOF is appropri- 
ate, however. The resulting coarse-grained system will 
distinguish between trajectories that circle the first or 
second attractor, but will be insensitive to the details 
of those trajectories. In a sense, this is analogous to 
the subtleties encountered in constructing renormaliza- 
tion group transformations for the critical behavior of 
antiferromagnetsjs^ 23 • 



IV. RESULTS OF COARSE-GRAINING ONE 
DIMENSIONAL CA 

A. Overview 

The coarse-graining procedure we described above is 
not constructive, but instead is a self-consistency
condition on a putative coarse-graining rule with a specific
supercell size N and projection operator P. In many
cases the single-valuedness condition Eq. (4) is not
satisfied, the coarse-graining fails and one must try other
choices of N and P. It is therefore natural to ask the
following questions. Can all CA be coarse-grained? If 
not, which CA can be coarse-grained and which cannot? 
What types of coarse-graining transitions can we hope to 
find? 

To answer these questions we tried systematically to
coarse-grain one dimensional CA. We considered
Wolfram's 256 elementary rules and several non-binary CA
of interest to us. Our coarse-graining procedure was
applied to each rule with different choices of N and P. In
this way we were able to coarse-grain 240 out of the 256
elementary CA. These 240 coarse-grainable rules
include members of all four classes. The 16 elementary CA
which we could not coarse-grain are rules 30, 45, 106,
154 and their symmetries. Rules 30, 45 and 106 belong
to class 3 while 154 is a class 2 rule. We don't know if
our inability to coarse-grain these 16 rules comes from
limited computing power or from something deeper. We
suspect (and give arguments in section V) the former.

The number of possible projection operators P grows
very fast with N. Even for small N, it is computationally
impossible to scan all possible P. In order to find
valid projections, we therefore used two simple search
strategies. In the first strategy, we looked for
coarse-graining transitions within the elementary CA family by
considering P which project back on the binary alphabet.
Excluding the trivial projections $P(x) = 0, \forall x$ and
$P(x) = 1, \forall x$ there are $2^{2^N} - 2$ such projections. We were
able to scan all of them for $N \le 4$ and found many
coarse-graining transitions. Figure 1 shows a map of the
coarse-graining transitions that we found within the family of
elementary rules. An arrow in the map indicates that
each rule from the origin group can be coarse-grained to
each rule from the target group. The supercell size N and
the projection P are not shown and each arrow may
correspond to several choices of N and P. As we explained
above, only coarse-grainings with $N \le 4$ are shown due
to limited computing power. Other transitions within
the elementary rule family may exist with larger values
of N. This map is in some sense an analogue of the
familiar renormalization group flow diagrams from statistical
mechanics.
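
The first search strategy amounts to an exhaustive scan over the non-constant binary projections. The following is a sketch under the assumption of elementary rules and brute-force enumeration; the function names are hypothetical and no claim is made that this matches the authors' implementation.

```python
from itertools import product


def supercell_table(rule_number, N):
    """Supercell update function: N steps of the rule on blocks of 3N cells."""
    def rule(left, centre, right):
        return (rule_number >> (4 * left + 2 * centre + right)) & 1

    states = list(product((0, 1), repeat=N))
    table = {}
    for x1, x2, x3 in product(states, repeat=3):
        cells = x1 + x2 + x3
        for _ in range(N):
            cells = tuple(rule(cells[i - 1], cells[i], cells[i + 1])
                          for i in range(1, len(cells) - 1))
        table[(x1, x2, x3)] = cells
    return table


def binary_coarse_grainings(rule_number, N):
    """Return the set of elementary rules reachable with supercell size N."""
    f_AN = supercell_table(rule_number, N)
    states = list(product((0, 1), repeat=N))
    found = set()
    for bits in product((0, 1), repeat=len(states)):
        if len(set(bits)) == 1:
            continue                      # skip the two constant projections
        P = dict(zip(states, bits))
        f_B, valid = {}, True
        for (x1, x2, x3), out in f_AN.items():
            key = (P[x1], P[x2], P[x3])
            if f_B.setdefault(key, P[out]) != P[out]:
                valid = False             # multi-valued: Eq. (4) fails
                break
        if valid:
            # Encode f_B as a Wolfram rule number ("000" is the lowest bit).
            found.add(sum(v << (4 * a + 2 * b + c)
                          for (a, b, c), v in f_B.items()))
    return found


print(sorted(binary_coarse_grainings(128, 2)))
```

The scan visits $2^{2^N} - 2$ projections, which is 14 for N = 2 and 65534 for N = 4, consistent with the rapid growth noted above.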

Several features of Fig. 1 are worthy of a short
discussion. First, notice that the map manifests the
"left" ↔ "right" and "0" ↔ "1" symmetries of the
elementary CA family. For example rules 252, 136 and 238
are the "0" ↔ "1", "left" ↔ "right" and the "0" ↔ "1" and
"left" ↔ "right" symmetries of rule 192 respectively.
Second, coarse-graining transitions are obviously transitive,
i.e. if A goes to B with $N_1$ and B goes to C with $N_2$ then
A goes to C with $N \le N_1 \cdot N_2$. For some transitions, the
map in Fig. 1 fails to show this property because we did
not attain large enough values of N.
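
The two symmetries can be written as operations on rule numbers. The function names `mirror` and `complement` are illustrative; the sketch reproduces the rule 192 example quoted above.

```python
def mirror(rule_number):
    """ "left" <-> "right": exchange the roles of the two outer neighbours."""
    result = 0
    for index in range(8):
        left, centre, right = (index >> 2) & 1, (index >> 1) & 1, index & 1
        if (rule_number >> index) & 1:
            result |= 1 << (4 * right + 2 * centre + left)
    return result


def complement(rule_number):
    """ "0" <-> "1": negate all three inputs and the output."""
    result = 0
    for index in range(8):
        # Negating the inputs maps neighbourhood index to 7 - index.
        if not (rule_number >> (7 - index)) & 1:
            result |= 1 << index
    return result


print(complement(192), mirror(192), mirror(complement(192)))  # 252 136 238
```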

Another interesting feature of the transition map is
that the apparent rule complexity never increases with a
coarse-graining transition. Namely, we never find a sim- 
ple behaving rule which after being coarse-grained be- 
comes a complex rule. The transition map, therefore, 
introduces a hierarchy of elementary rules and this hier- 
archy agrees well with the apparent rule complexity. The 
hierarchy is partial and we cannot relate rules which are 
not connected by a coarse-graining transition. As op- 
posed to the Wolfram classification, this coarse-graining 
hierarchy is well defined and is therefore a good candi- 
date for a complexity measured [H [H El lH IM IH 

Finally notice that the eight rules 0, 60, 90, 102, 150,
170, 204, 240, whose update function has the additive
form

$f_{\alpha\beta\gamma}[x_{n-1}, x_n, x_{n+1}] = \alpha \cdot x_{n-1} \oplus \beta \cdot x_n \oplus \gamma \cdot x_{n+1}$ ,
$\alpha, \beta, \gamma \in \{0, 1\}$ , (12)

where $\oplus$ denotes the XOR operation, are all fixed points
of the map. This result is not limited to elementary rules.
As showed by Barbe et.al HHHIil , additive CA in arbi- 
trary dimension whose alphabet sizes are prime numbers 
coarse-grain themselves. We conjecture that there are 
situations where reducible fixed points exist for a wide 
range of systems, analogous to the emergence of ampli- 
tude equations in the vicinity of bifurcation points. 
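
The list of eight additive rules follows directly from Eq. (12). A short sketch, with the hypothetical helper `additive_rule_number`, enumerates them:

```python
from itertools import product


def additive_rule_number(alpha, beta, gamma):
    """Wolfram rule number of alpha*x[n-1] XOR beta*x[n] XOR gamma*x[n+1]."""
    number = 0
    for index in range(8):
        left, centre, right = (index >> 2) & 1, (index >> 1) & 1, index & 1
        output = (alpha & left) ^ (beta & centre) ^ (gamma & right)
        number |= output << index
    return number


additive_rules = sorted(additive_rule_number(a, b, g)
                        for a, b, g in product((0, 1), repeat=3))
print(additive_rules)  # [0, 60, 90, 102, 150, 170, 204, 240]
```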

When projecting back on the binary alphabet, one
maximizes the amount of information lost in the
coarse-graining transition. At first glance, this seems to be an
unlikely strategy, because it is difficult for the
coarse-grained CA to emulate the original one when so much
information was lost. In terms of our coarse-graining
procedure such a projection maximizes the number of
instances $P(x) = P(y)$ of Eq. (4). On second examination,
however, this strategy is not that poor. The fact that
there are only two states in the coarse-grained alphabet
reduces the probability that an instance $P(x) = P(y)$ of
Eq. (4) will be violated to 1/2. The extreme case of this
argument would be a projection on a coarse-grained
alphabet with a single state. Such a trivial projection will
never violate Eq. (4) (but will never show any patterns
or dynamics either).

A second search strategy for valid projection operators
that we used is located on the other extreme of the above
tradeoff. Namely, we attempt to lose the smallest possible
amount of information. We start by choosing two
supercell states $z_1$ and $z_2$ and unite them using

$$P_0(x) = \begin{cases} z_1, & x = z_2 \\ x, & x \neq z_2 \end{cases}, \tag{13}$$

where the subscript in $P_0$ denotes that this is an initial
trial projection to be refined later. The refinement
process of the projection operator proceeds as follows. If $P_n$
(starting with n = 0) satisfies the coarse-graining condition
then we are done. If on the other hand, the condition is
violated by some

$$P_n(f_{A^N}[x_1, x_2, x_3]) \neq P_n(f_{A^N}[y_1, y_2, y_3]), \qquad P_n(x_i) = P_n(y_i), \tag{14}$$

FIG. 1: Coarse-graining transitions within the family of 256 elementary CA. Only transitions with a supercell size N = 2, 3, 4
are shown. An arrow indicates that the origin rules can be coarse-grained by the target rules and may correspond to several
choices of N and P.



the inequality is resolved by refining $P_n$ to

$$P_{n+1}(x) = \begin{cases} P_n(r_1), & P_n(x) = P_n(r_2) \\ P_n(x), & \text{otherwise} \end{cases}, \qquad r_1 = f_{A^N}[x_1, x_2, x_3], \quad r_2 = f_{A^N}[y_1, y_2, y_3]. \tag{15}$$

This process is repeated until the coarse-graining condition
is satisfied. A non-trivial coarse-graining is found in cases
where the resulting projection operator is non-constant (more
than a single state in the coarse-grained CA).
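
The refinement loop can be sketched for elementary CA as follows. This is an illustration under stated assumptions rather than the authors' implementation: supercell states are encoded as integers, the supercell table is built by brute force (running the rule for N steps on every 3N-cell input), and classes are merged by relabelling. The names `pack`, `supercell_table` and `refine` are hypothetical.

```python
from itertools import product

def pack(bits):
    """Encode a tuple of binary cells as an integer supercell state."""
    return int("".join(map(str, bits)), 2)

def supercell_table(rule, N):
    """f_{A^N}: run elementary CA `rule` for N steps on every 3N-cell input."""
    table = {}
    for cells in product((0, 1), repeat=3 * N):
        row = cells
        for _ in range(N):  # each step trims one cell from both ends
            row = tuple((rule >> (4 * a + 2 * b + c)) & 1
                        for a, b, c in zip(row, row[1:], row[2:]))
        key = tuple(pack(cells[i * N:(i + 1) * N]) for i in range(3))
        table[key] = pack(row)
    return table

def refine(rule, N, z1, z2):
    """Unite z1 and z2, then merge output classes until no violation of the
    kind shown in Eq. (14) remains. Returns a dict state -> class label."""
    f = supercell_table(rule, N)
    P = {x: x for x in range(2 ** N)}
    P[z2] = z1                                   # initial trial projection P_0
    while True:
        seen = {}                                # projected inputs -> projected output
        clash = None
        for (x1, x2, x3), out in f.items():
            key = (P[x1], P[x2], P[x3])
            if seen.setdefault(key, P[out]) != P[out]:
                clash = (seen[key], P[out])
                break
        if clash is None:
            return P                             # no violation left: P is valid
        keep, drop = clash                       # refinement step: merge two classes
        P = {x: (keep if label == drop else label) for x, label in P.items()}

P = refine(105, 2, z1=0b01, z2=0b10)
print(len(set(P.values())))  # 2
```

For rule 105 with N = 2, uniting the two mixed pairs forces the two uniform pairs to be united as well, leaving two classes; a result with a single class would signal that only the trivial projection survives for that starting pair.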

By trying all possible initial pairs $z_1$, $z_2$, the above
projection search method is guaranteed to find a valid
projection if such a projection exists on the scale defined by
the supercell size N. Using this method we were able
to coarse-grain many CA. The resulting coarse-grained
CA that are generated in this way are often multicolored
and do not belong to the elementary CA family. For
this reason it is difficult to graphically summarize all the
transitions that we found in a map. Instead of trying to
give an overall view of those transitions we will concentrate
our attention on several interesting cases which we
include in the examples section below.



B. Examples 

1. Rule 105 

As our first example we choose a transition between
two class 2 rules. The elementary rule 105 is defined on
the alphabet {□, ■} with the transition function

$$f_{105}[x_{n-1}, x_n, x_{n+1}] = \overline{x_{n-1} \oplus x_n \oplus x_{n+1}}, \tag{16}$$

where the over-bar denotes the NOT operation, and
□ = 0, ■ = 1. We use a supercell size N = 2 and calculate
the transition function $f_{105^2}$, defined on the alphabet
{□□, □■, ■□, ■■}. Now we project this alphabet back
on the {□, ■} alphabet with

$$P(x) = \begin{cases} □, & x = □■, ■□ \\ ■, & x = □□, ■■ \end{cases}. \tag{17}$$

A pair of cells in rule 105 is coarse-grained to a single
cell and the value of the coarse cell is black only when the
pair shares the same value. Using the above projection
operator we construct the transition function of the coarse
CA. The result is found to be the transition function of
the additive rule 150:

$$f_{150}[x_{n-1}, x_n, x_{n+1}] = x_{n-1} \oplus x_n \oplus x_{n+1}. \tag{18}$$







Figure 2 shows the results of this coarse-graining
transition. In Fig. 2(a) we show the evolution of rule 105
with a specific initial condition while Fig. 2(b) shows
the evolution of rule 150 from the coarse-grained initial
condition. The small scale details in rule 105 are lost
in the transformation but extended white and black
regions are coarse-grained to black regions in rule 150. The
time evolution of rule 150 captures the overall shape of
these large structures but without the black-white
decorations. As shown in Fig. 1, rule 150 is a fixed point
of the transition map. Rule 105 can therefore be further
coarse-grained to arbitrary scales.
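
The transition from rule 105 to rule 150 can be verified at the level of rule tables. The sketch below is an illustration, not the authors' code: it assumes that the supercell update is obtained by running the rule for N time steps on 3N cells, and the helper `coarse_rule` is a hypothetical name. It returns the coarse update table induced by a projection, or `None` when two inputs with equal projections give outputs with different projections.

```python
from itertools import product

def step(rule, row):
    """One synchronous update of an elementary CA on an open row of cells."""
    return [(rule >> (4 * a + 2 * b + c)) & 1
            for a, b, c in zip(row, row[1:], row[2:])]

def coarse_rule(rule, N, P):
    """Coarse table {(P(x1), P(x2), P(x3)): P(f_{A^N}[x1, x2, x3])},
    or None if the projection P is not valid for this rule and N."""
    coarse = {}
    for cells in product((0, 1), repeat=3 * N):
        row = list(cells)
        for _ in range(N):
            row = step(rule, row)
        key = tuple(P(cells[i * N:(i + 1) * N]) for i in range(3))
        out = P(tuple(row))
        if coarse.setdefault(key, out) != out:
            return None
    return coarse

def P_105(pair):
    """Eq. (17): black (1) when both cells agree, white (0) otherwise."""
    return 1 if pair[0] == pair[1] else 0

table = coarse_rule(105, 2, P_105)
number = sum(out << (4 * a + 2 * b + c) for (a, b, c), out in table.items())
print(number)  # 150
```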



2. Rule 146 

As a second example of coarse-grained-able elementary
CA we choose rule 146. Rule 146 is defined on the {□, ■}
alphabet with the transition function

$$f_{146}[x_{n-1}, x_n, x_{n+1}] = \begin{cases} ■, & x_{n-1}x_nx_{n+1} = □□■, ■□□, ■■■ \\ □, & \text{all other combinations} \end{cases}. \tag{19}$$

It produces a complex, seemingly random behavior which
falls into the class 3 group. We choose a supercell size
N = 3 and calculate the transition function $f_{146^3}$, defined
on the alphabet {□□□, . . . ■■■}. Now we project
this alphabet back on the {□, ■} alphabet with

$$P(x) = \begin{cases} □, & x \neq ■■■ \\ ■, & x = ■■■ \end{cases}. \tag{20}$$

Namely, a triplet of cells in rule 146 is coarse-grained
to a single cell and the value of the coarse cell is black
only when the triplet is all black. Using the above
projection operator we construct the transition function of
the coarse CA. The result is found to be the transition
function of rule 128, which was given earlier. Rule
146 can therefore be coarse-grained by rule 128, a class
1 elementary CA. In Figure 3 we show the results of this
coarse-graining. Fig. 3(a) shows the evolution of rule
146 with a specific initial condition while Fig. 3(b) shows
the evolution of rule 128 from the coarse-grained initial
condition. Our choice of coarse-graining has eliminated
the small scale details of rule 146. Only structures of
lateral size of three or more cells are accounted for. The
decay of such structures in rule 146 is accurately described
by rule 128.

Note that a class 3 CA was coarse-grained to a class 1 
CA in the above example. Our gain was therefore two- 
fold. In addition to the phase space reduction associated 
with coarse-graining we have also achieved a reduction in 
complexity. Our procedure was able to find predictable 
coarse-grained aspects of the dynamics even though the 
small scale behavior of rule 146 is complex, potentially 
irreducible. 
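
The same statement can be checked on trajectories: projecting after every three steps of rule 146 should reproduce one step of rule 128 applied to the projected configuration. The sketch below assumes a periodic lattice whose length is a multiple of the supercell size, and a random initial condition biased towards black so that all-black triplets actually occur; both choices are illustrative.

```python
import random

def evolve(rule, row, steps):
    """Run an elementary CA on a periodic lattice for `steps` time steps."""
    n = len(row)
    for _ in range(steps):
        row = [(rule >> (4 * row[i - 1] + 2 * row[i] + row[(i + 1) % n])) & 1
               for i in range(n)]
    return row

def project(row, N=3):
    """Eq. (20): a block of N cells maps to 1 only if all of its cells are 1."""
    return [int(all(row[i:i + N])) for i in range(0, len(row), N)]

random.seed(0)
N, blocks, T = 3, 40, 10
fine = [random.choice((0, 1, 1, 1)) for _ in range(N * blocks)]
coarse = project(fine)
for t in range(T):
    fine = evolve(146, fine, N)      # N fine steps ...
    coarse = evolve(128, coarse, 1)  # ... correspond to one coarse step
    assert project(fine) == coarse   # the two routes must agree
print("projection commutes with the dynamics for", T, "coarse steps")
```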

Rule 146 can also be coarse-grained by non-elementary
CA. Using a supercell size of N = 6 we found that
the difference between the combinations and 
is irrelevant to the long time behavior of rule 
146. It is therefore possible to project these two com- 
binations into a single coarse grained state. The same 
is true for the combinations and 
which can be projected to another coarse-grained state. 
The end result of this coarse-graining (Fig. 3(c)) is a
62 color CA which retains the information of all other
6 cell combinations. The amount of information lost in
this transition is relatively small: 2/64 of the supercell
states have been eliminated. More impressive alphabet
reductions can be found by going to larger scales. For
N = 7, 8, 9, 10 and 11 we found an alphabet reduction of
9/128, 33/256, 97/512, 261/1024 and 652/2048
respectively. Fig. 3(d) shows the percentage of states that can
be eliminated as a function of the supercell size N. All
of the information lost in those coarse-grainings
corresponds to irrelevant DOF.

The two different coarse-graining transitions of rule
146 that we presented above are a good opportunity to
show the difference between relevant and irrelevant DOF.

FIG. 3: (Color online) Coarse-graining of rule 146 by rule 128 and by a 62 color CA. (a) shows results of running rule 146.
The top line is the initial condition and time progresses from top to bottom. (b) shows the results of running rule 128 with the
coarse-grained initial condition from (a). (c) shows results of running the 62 color CA which is a coarse-grained version of rule
146. (d) shows the percentage of supercell states that can be eliminated when coarse-graining rule 146 with different supercell
sizes N.



As we explained earlier, a transition like 146 → 128 where
the rules have different complexities must involve the
elimination of relevant DOF. Indeed if we modify an initial
condition of rule 146 by replacing a □■□ segment with
□■■ we will get a modified evolution. As we show in
Figure 4 the difference in the trajectories has a complex
behavior and is unbounded in space and time. However,
since □■□ and □■■ are both projected by Eq. (20) to
□, the projections of the original and modified trajectories
will be identical. In contrast, the coarse graining of
rule 146 to the 62 state CA of Fig. 3(c) involves the
elimination of irrelevant DOF only. If we replace a
in the initial condition with a we find that the 

difference between the modified and unmodified trajec- 
tories decays after a few time steps. 



3. Rule 184 

The elementary CA rule 184 is a simplified one lane
traffic flow model. Its transition function is given by

$$f_{184}[x_{n-1}, x_n, x_{n+1}] = \begin{cases} □, & x_{n-1}x_nx_{n+1} = □□□, □□■, □■□, ■■□ \\ ■, & x_{n-1}x_nx_{n+1} = □■■, ■□□, ■□■, ■■■ \end{cases}. \tag{21}$$

Identifying a black cell with a car moving to the right and
a white cell with an empty road segment we can rewrite
the update rule as follows. A car with an empty road
segment to its right advances and occupies the empty
segment. A car with another car to its right will avoid a
collision and stay put. This is a deterministic and
simplified version of the more realistic Nagel-Schreckenberg
model.

FIG. 4: The sensitivity of rule 146 to a relevant DOF change
in its initial condition. The figure shows the difference
(modulo 2) in the trajectories resulting from replacing a □■□
segment in the initial condition with □■■.

Rule 184 can be coarse-grained to a 3 color CA using
a supercell size N = 2 and the local density projection

$$P(x) = \begin{cases} □, & x = □□ \\ ▨, & x = □■, ■□ \\ ■, & x = ■■ \end{cases}, \tag{22}$$

where ▨ denotes the grey state of density 1/2.
The update function of the resulting CA is given by

$$f[y_{n-1}, y_n, y_{n+1}] = \begin{cases} □, & y_{n-1}y_ny_{n+1} = □□□, □□▨, □□■, □▨□, □▨▨ \\ ■, & y_{n-1}y_ny_{n+1} = □■■, ▨■■, ■■■, ▨▨■, ■▨■ \\ ▨, & \text{all other combinations} \end{cases}. \tag{23}$$

Figure 5 shows the result of this coarse-graining. Fig. 5(a)
shows a trajectory of rule 184 while Fig. 5(b) shows
the trajectory of the coarse CA. From this figure it is
clear that the white zero density regions correspond to
empty road and the black high density regions correspond
to traffic jams. The density 1/2 grey regions correspond
to free flowing traffic with an exception near traffic jams
due to a boundary effect.
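
The local density projection can be tested by brute force. The sketch below assumes that the coarse states are encoded by the number of cars in a pair (0 = white, 1 = grey, 2 = black) and that the supercell update corresponds to two time steps of rule 184 on six cells; it checks that the projection is consistent and tabulates the coarse outputs.

```python
from collections import Counter
from itertools import product

RULE, N = 184, 2

def density(block):
    """Eq. (22): 0 = white (empty), 1 = grey (one car), 2 = black (two cars)."""
    return sum(block)

coarse = {}
valid = True
for cells in product((0, 1), repeat=3 * N):
    row = cells
    for _ in range(N):
        row = tuple((RULE >> (4 * a + 2 * b + c)) & 1
                    for a, b, c in zip(row, row[1:], row[2:]))
    key = tuple(density(cells[i * N:(i + 1) * N]) for i in range(3))
    if coarse.setdefault(key, density(row)) != density(row):
        valid = False            # same projected input, different projected output

print(valid)                     # True
print(Counter(coarse.values()))  # how many of the 27 entries give each color
```

In the resulting table the output density equals the density of the middle pair, plus one if that pair is not full and the left pair holds a car, minus one if that pair is not empty and the right pair is not full, which is the car-conservation picture described above.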

By using larger supercell sizes it is possible to find
other coarse-grained versions of rule 184. As in the above
example, the coarse-grained states group together local
configurations of equal car densities. The projection
operators however are not functions of the local density
alone. They are a partition of such a function and there
could be several coarse-grained states which correspond
to the same local car density. We found (empirically)
that for even supercell sizes N = 2k the coarse-grained
CA contain $k^2/2 + 3k/2 + 1$ states and for odd supercell
sizes N = 2k + 1 they contain $k^2 + 3k + 2$ states. Figure
5(c) shows the amount of information lost in those
transitions as a function of N. Most of the lost information
corresponds to relevant DOF but some of it is irrelevant.



4. Rule 110 

Rule 110 is one of the most interesting rules in the
elementary CA family. It belongs to class 4 and exhibits a
complex behavior where several types of "particles" move
and interact above a regular background. The behavior
of these "particles" is rich enough to support universal
computation. In this sense rule 110 is maximally complex
because it is capable of emulating all computations
done by other computing devices in general and CA in
particular. As a consequence it is also undecidable.

We found several ways to coarse-grain rule 110. Using
N = 6, it is possible to project the 64 possible supercell
states onto an alphabet of 63 symbols. Figures 6(a) and
(b) show a trajectory of rule 110 and the corresponding
trajectory of the coarse-grained 63 state CA. A more
impressive reduction in the alphabet size is obtained by
going to larger values of N. For N = 7, 8, 9, 10, 11, 12 we
found an alphabet reduction of 6/128, 22/256, 67/512,
182/1024, 463/2048 and 1131/4096 respectively. Only
irrelevant DOF are eliminated in those transitions. Fig.
6(c) shows the percentage of reduced states as a function
of the supercell size N. We expect this behavior to persist
for larger values of N.

Another important coarse-graining of rule 110 that we
found is the transition to rule 0. Rule 0 has the trivial
dynamics where all initial states evolve to the null
configuration in a single time step. The transition to rule 0
is possible because many cell sequences cannot appear
in the long time trajectories of rule 110. For example 
the sequence is a so called "Garden of Eden" 

of rule 110. It cannot be generated by rule 110 and can
only appear in the initial state. Coarse-graining by rule 0
is achieved in this case using N = 5 and projecting
this sequence to ■ and all other five cell combinations to □.
Another example is the sequence 

This sequence is a "Garden of Eden" of the N = 13
supercell version of rule 110. It can appear only in the first
12 time steps of rule 110 but no later. Coarse-graining
by rule 0 is achieved in this case using N = 13 and
projecting this sequence to ■ and all other 13 cell
combinations to □. These examples are important
because they show that even though rule 110 is undecidable
it has decidable and predictable coarse-grained aspects
(however trivial). To our knowledge rule 110 is the only
proven undecidable elementary CA and therefore this is
the only (proven) example of an undecidable to decidable
transition that we found within the elementary CA family.

It is interesting to note that the number of "Garden of
Eden" states in supercell versions of rule 110 grows very
rapidly with the supercell size N. As we show in Fig. 6(d),
the fraction of "Garden of Eden" states out of the $2^N$
possible sequences grows almost linearly with N. In
addition, at every scale N there are new "Garden of Eden"
sequences which do not contain any smaller "Gardens of
Eden" as subsequences. These results are consistent with
our understanding that even though the dynamics looks
complex, more and more structure emerges as one goes
to larger scales. We will have more to say about this in
Sec. V.

FIG. 5: Coarse graining of rule 184 by a 3 state CA. (a) shows a trajectory of rule 184. (b) shows the corresponding trajectory
of the coarse-grained CA. (c) shows the percentage of supercell states that can be eliminated when coarse graining rule 184
with different supercell sizes N.

The "Garden of Eden" states of supercell versions of
rule 110 represent pieces of information that can be used
in reducing the computational effort in rule 110. The
reduction can be achieved by truncating the supercell
update rule to be a function of only non "Garden of Eden"
states. The size of the resulting rule table will be much
smaller (≈ 3% with N = 21) than the size of the supercell
rule table. Efficient computations of rule 110 can then be
carried out by running rule 110 for the first N time steps.
After N time steps the system contains no "Garden of
Eden" sequences and we can continue to propagate it by
using the truncated supercell rule table without losing
any information. Note that we have not reduced rule 110
to a decidable system. At every scale we achieved a
constant reduction in the computational effort. Wolfram has
pointed out that many irreducible systems have pockets
of reducibility and termed such a reduction "superficial
reducibility" (see page 746 of the cited reference). It will be
interesting to check how much "superficial reducibility"
is contained in rule 110 at larger scales. It will be
inappropriate to call it "superficial" if the curve in Fig. 6(d)
approaches 100% in the large N limit.



5. Albert and Culik universal CA 

It might be argued that the coarse-graining of rule 110
by rule 0 is a trivial example of an undecidable to
decidable coarse-graining transition. The fact that certain
configurations cannot be arrived at in the long time
behavior is not very surprising and is expected of any
irreversible system. In order to search for more interesting
examples we studied other one dimensional universal CA
that we found in the literature. Lindgren and Nordahl
[25] constructed a 7 state nearest neighbor and a 4 state
next-nearest neighbor CA that are capable of emulating
a universal Turing machine. The entries in the update
tables of these CA are only partly determined by the
emulated Turing machine and can be completed at will. We
found that for certain completion choices these two
universal CA can be coarse-grained to a trivial CA which,
like rule 0, decays to a quiescent configuration in a single
time step. Another universal CA that can undergo such
a transition is Wolfram's 19 state, next-nearest neighbor
universal CA. These results are essentially equivalent
to the rule 110 → rule 0 transition.

FIG. 6: (Color online) Coarse graining of rule 110. (a) shows a trajectory of rule 110. (b) shows a coarse graining of rule 110 by
a 63 color CA. (c) shows the percentage of supercell states that can be eliminated when coarse graining rule 110 with different
supercell sizes N. (d) shows the percentage of "Garden of Eden" states out of the $2^N$ possible states of supercell N versions of
rule 110.

A more interesting example is Albert and Culik's
universal CA. It is a 14 state nearest-neighbor CA which
is capable of emulating all other CA. The transition table
of this CA is only partly determined by its construction
and can be completed at will. We found that when the
empty entries in the transition function are filled by the
copy operation

$$f[x_{n-1}, x_n, x_{n+1}] = x_n, \tag{24}$$

the resulting undecidable CA has many coarse-graining
transitions to decidable CA. In all these transitions the
coarse-grained CA performs the copy operation Eq. (24)
for all $(x_{n-1}, x_n, x_{n+1})$. Different transitions differ in
the projection operator and the alphabet size of the
coarse-grained CA. Figure 7 shows a coarse-graining of
Albert and Culik's universal CA to a 4 state copy CA.
The coarse-grained CA captures three types of persistent
structures that appear in the original system but is
ignorant of more complicated details. The supercell size used
here is N = 2.



V. COARSE-GRAIN-ABILITY OF LOCAL 
PROCESSES 

In the previous section we showed that a large majority
of elementary CA can be coarse-grained in space and
time. This is rather surprising since finding a valid
projection operator is equivalent to solving the projection
equation, which is greatly over constrained. Solutions for
this equation should be rare for random choices of the
matrix that enters it. In this section we show that solutions
are frequent because this matrix is not random but a highly
structured object. As the supercell size N is increased, it
becomes less random and the probability of finding a valid
projection approaches unity.

FIG. 7: (Color online) Coarse graining of Albert and Culik's 14 state universal CA by a 4 state copy CA. (a) shows
a trajectory of Albert and Culik's universal CA while (b) shows the corresponding trajectory of the coarse-grained CA. The
supercell size used here is N = 2.

To appreciate the high success rate in coarse-graining
elementary CA consider the following statistics. By using
supercells of size N = 2 and considering all possible
projection operators P : {0 … 3} → {0, 1} we were able to
coarse-grain approximately one third of all 256 elementary
CA rules. Recall that the coarse-graining procedure
that we use involves two stages. In the first stage we
generate the supercell version $A^N$, a 4 color CA in the
N = 2 case. In the second stage we look for valid
projection operators. 4 color CA that are N = 2 supercell
versions of elementary CA are a tiny fraction of all
possible ($4^{4^3} \approx 3 \times 10^{38}$) 4 color CA. If we pick a random
4 color CA and try to project it, i.e. attempt to solve
the projection equation with $A^N$ replaced by an arbitrary
4 color CA, we find an average of one solvable instance out
of every ≈ 1.6 × 10^ attempts. This large difference in the
projection probability indicates that 4 color CA which
are supercell versions of elementary rules are not random.
The numbers become more convincing when we go
to larger values of N and attempt to find projections to
random $2^N$ color CA.
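
The N = 2 scan over binary projections is small enough to repeat by brute force. The sketch below is an illustration under assumptions: it accepts a rule as soon as any non-constant projection P : {0 … 3} → {0, 1} is consistent, without filtering out trivial coarse-grained rules, so the count it prints need not match the fraction quoted above if the original study applied further criteria.

```python
from itertools import product

def supercell_table(rule, N=2):
    """Supercell update: N steps of the rule on every 3N-cell input."""
    table = {}
    for cells in product((0, 1), repeat=3 * N):
        row = cells
        for _ in range(N):
            row = tuple((rule >> (4 * a + 2 * b + c)) & 1
                        for a, b, c in zip(row, row[1:], row[2:]))
        table[tuple(cells[i * N:(i + 1) * N] for i in range(3))] = row
    return table

def is_valid(table, P):
    """True if equal projected inputs always give equal projected outputs."""
    coarse = {}
    for (x1, x2, x3), out in table.items():
        key = (P[x1], P[x2], P[x3])
        if coarse.setdefault(key, P[out]) != P[out]:
            return False
    return True

states = list(product((0, 1), repeat=2))
projections = [dict(zip(states, values))
               for values in product((0, 1), repeat=4)
               if len(set(values)) == 2]           # skip the two constant maps

coarse_grainable = []
for rule in range(256):
    table = supercell_table(rule)
    if any(is_valid(table, P) for P in projections):
        coarse_grainable.append(rule)

print(len(coarse_grainable), "of 256 rules admit a non-constant binary projection at N = 2")
```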

To put our arguments on a more quantitative level we
need to quantify the information content of supercell
versions of CA. An accepted measure in algorithmic
information theory for the randomness and information content
of an individual object is its Kolmogorov complexity
(algorithmic complexity) [45, 46]. The Kolmogorov
complexity $K_U(x)$ of a string of characters x with respect to
a universal computer U is defined as

$$K_U(x) = \frac{L_U(x)}{\mathrm{length}(x)}, \tag{25}$$

where length(x) is the length of x in bits and $L_U(x)$ is the
bit length of the minimal computer program that generates
x and halts on U (irrespective of the running time).
This definition is sensitive to the choice of machine U
only up to an additive constant in $L_U(x)$ which does not
depend on x. For long strings this dependency is negligible
and the subscript U can be dropped. According to
this definition, strings which are very structured require
short generating programs and will therefore have small
Kolmogorov complexity. For example, a periodic x with
period p can be generated by a ∼ p long program and
K(x) ∼ p/length(x). In contrast, if x has no structure
it must be generated literally, i.e. the shortest program
is "print(x)". In such cases L(x) ≈ length(x), K(x) ≈ 1
and the information content of x is maximal. By using
simple counting arguments [45] it is easy to show that
simple objects are rare and that K(x) ∼ 1 for most
objects x. Kolmogorov complexity is a powerful and elegant
concept which comes with an annoying limitation. It is
uncomputable, i.e. it is impossible to find the length of
the minimal program that generates a string x. It is only
possible to bound it.
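
Since K(x) can only be bounded, a general purpose compressor gives a rough, computable stand-in for such a bound. The sketch below uses `zlib` purely as an illustration of the periodic versus structureless contrast; it is not the bound used in this paper, and the compressed size overestimates the minimal program length.

```python
import random
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size over raw size: a crude upper-bound proxy for K(x)."""
    return len(zlib.compress(data, 9)) / len(data)

random.seed(1)
periodic = bytes([3, 1, 4, 1, 5]) * 2000                       # period p = 5
noise = bytes(random.getrandbits(8) for _ in range(10000))     # no structure

print(compression_ratio(periodic))   # small: little more than the period is stored
print(compression_ratio(noise))      # close to 1: essentially "print(x)"
```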

It is easy to see that supercell CA are highly structured
objects by looking at their Kolmogorov complexity.
Consider the CA $A = (a(t), S, f_A)$ and its N'th supercell
version $A^N = (a^N, S^N, f_{A^N})$ (for simplicity of notation we
omit the subscript A from the alphabet size). The
transition function $f_{A^N}$ is a table that specifies a cell's new
state for all $S^{3N}$ possible local configurations (assuming
A is nearest neighbor and one dimensional). $f_{A^N}$ can
therefore be described by a string of $S^{3N}$ symbols from
the alphabet $\{0 \ldots S^N - 1\}$. The bit length length($f_{A^N}$)
of such a description is

$$\mathrm{length}(f_{A^N}) = S^{3N} \, N \log_2 S. \tag{26}$$



If $A^N$ was a typical CA with $S^N$ colors we could
expect that $L(f_{A^N})$, the length of the minimal program
that generates $f_{A^N}$, will not differ significantly from
length($f_{A^N}$). However, since $A^N$ is a supercell version
of A we have a much shorter description, i.e. to
construct $A^N$ from A. This construction involves running
A for N time steps for all possible initial configurations of
3N cells. It can be conveniently coded in a program as
repeated applications of the transition function $f_A$ within
several loops. Up to an additive constant, the length
of such a program will be equal to the bit length
description of $f_A$:

$$\bar{L}(f_{A^N}) = S^3 \log_2 S. \tag{27}$$

Note that we have used $\bar{L}$ to indicate that this is an upper
bound for the length of the minimal program that
generates $f_{A^N}$. This upper bound, however, should be tight for
an update rule with little structure. The Kolmogorov
complexity of $f_{A^N}$ can consequently be bounded by

$$K(f_{A^N}) \le \frac{\bar{L}(f_{A^N})}{\mathrm{length}(f_{A^N})}. \tag{28}$$

This complexity approaches zero at large values of N.
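
To see how quickly the bound of Eq. (28) decays, the ratio of Eq. (27) to Eq. (26) can be evaluated numerically. The sketch below neglects the additive constant mentioned above, so the ratio reduces to $1/(N S^{3(N-1)})$; the function names are illustrative.

```python
from math import log2

def description_length(S, N):
    """Eq. (26): bits needed to write the supercell rule table literally."""
    return S ** (3 * N) * N * log2(S)

def complexity_bound(S, N):
    """Eq. (28) with the additive constant dropped: S^3 log2(S) bits of
    program divided by the literal length of the supercell table."""
    return S ** 3 * log2(S) / description_length(S, N)

for N in range(1, 6):
    print(N, complexity_bound(2, N))
```

For the binary alphabet the bound is 1 at N = 1 and already 0.0625 at N = 2, after which it falls by more than a factor of 8 per unit increase of N.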

Our argument above shows that the large scale behav- 
ior of CA (or any local process) must be simple in some 
sense. We would like to continue this line of reasoning 
and conjecture that the small Kolmogorov complexity of 
the large scale behavior is related to our ability to coarse- 
grain many CA. At present we are unable to prove this 
conjecture analytically, and must therefore resort to nu- 
merical evidence which we present below. 



A. Garden of Eden states of supercell CA 

Ideally, in order to show that such a connection exists
one would attempt to coarse-grain CA with different
alphabets and on different length scales (supercell sizes),
and verify that the success rate correlates with the
Kolmogorov complexity of the generated supercell CA. This,
however, is computationally very challenging and going
beyond CA with a binary alphabet and supercell sizes of
more than N = 4 is not realistic. A more modest
experiment is the following. We start with a CA A with
an alphabet S, and check whether its N supercell version
$A^N$ contains all possible states. Namely, if there exists
$x \in \{0 \ldots S^N - 1\}$ such that

$$f_{A^N}(y_1, y_2, y_3) \neq x, \qquad \forall\, y_1, y_2, y_3 \in \{0 \ldots S^N - 1\}. \tag{29}$$

Such a missing state of $A^N$ is sometimes referred to as
a "Garden of Eden" configuration because it can only
appear in the initial state of $A^N$. Note that by the
construction of $A^N$, a "Garden of Eden" state of $A^N$ can
appear only in the first N − 1 time steps of A and is
therefore a generalized "Garden of Eden" of A. In cases
where a state of $A^N$ is missing, A can be trivially
coarse-grained to the elementary CA rule 0 by projecting the
missing state of $A^N$ to "1" and all other combinations
to "0". This type of trivial projection was discussed
earlier in connection with the coarse-graining of rule 110.
Finding a "Garden of Eden" state of $A^N$ is computationally
relatively easy because there is no need to calculate
the supercell transition function $f_{A^N}$. It is enough to
back-trace the evolution of A and check if all N cell
combinations have a 3N cell ancestor combination, N time
steps in the past.
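
For small N the missing states of an elementary CA can be found without back-tracing at all. The sketch below swaps in a plain forward enumeration of all 3N-cell ancestors, which is simpler but exponentially more expensive than the back-trace described above; the function names are illustrative.

```python
from itertools import product

def reachable_states(rule, N):
    """All N-cell words that elementary CA `rule` can produce N steps after
    some 3N-cell ancestor (brute force, practical only for small N)."""
    seen = set()
    for cells in product((0, 1), repeat=3 * N):
        row = cells
        for _ in range(N):
            row = tuple((rule >> (4 * a + 2 * b + c)) & 1
                        for a, b, c in zip(row, row[1:], row[2:]))
        seen.add(row)
    return seen

def garden_of_eden_states(rule, N):
    """N-cell words with no 3N-cell ancestor N time steps in the past."""
    seen = reachable_states(rule, N)
    return [w for w in product((0, 1), repeat=N) if w not in seen]

missing = garden_of_eden_states(110, 5)
print((0, 1, 0, 1, 0) in missing)  # True
```

The word 01010 has no predecessor under rule 110 even after a single step, so it is necessarily absent at N = 5. Projecting any such missing word to "1" and every other word to "0" gives the trivial coarse-graining to rule 0 described above.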

Figure 8(a) shows the statistics obtained from such an
experiment. It exhibits the fraction $R_{GE}$ of CA rules with
different alphabet sizes S, whose N'th supercell version is
missing at least one state. Each data point in this figure
was obtained by testing 10,000 CA rules. The fraction
$R_{GE}$ approaches unity at large values of N, an expected
behavior since most of the CA are irreversible.

Figure 8(b) shows the same data as in (a) when plotted
against the variable $\xi = K \cdot C^S$ where S is the alphabet
size, K is the upper bound for the Kolmogorov complexity
of the supercell CA from Eq. (28) and C is a constant.
The excellent data collapse implies a strong correlation
between the probability of finding a missing state and the
Kolmogorov complexity of a supercell CA. This figure
also shows that the data points can be accurately fitted
by

$$R_{GE}(N, S) = \frac{1}{1 + (\xi/\xi_0)^{\alpha}} \tag{30}$$

with $\xi_0$ a constant and $\alpha \approx 0.7$ (solid line in Fig. 8(b)).
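Having the scaling form

$$R_{GE}(N, S) = F(\xi) = F\left(K(N, S) \cdot C^S\right) \tag{31}$$

we can now study the behavior of $R_{GE}$ with large alphabet
sizes. Assuming F and $\xi$ to be continuous we define $\xi_h$ as
the point where $F(\xi_h) = 1/2$. For a fixed value of S, the
slope of $R_{GE}$ at the transition region can be calculated by

$$\left. \frac{\partial R_{GE}}{\partial N} \right|_{\xi_h} = -F'(\xi_h)\, \xi_h \left( \frac{1}{N(\xi_h)} + 3 \log S \right) \tag{32}$$

where

$$N(\xi_h) = \frac{3 \log S + S \cdot \log C - \log \xi_h - \log N(\xi_h)}{3 \cdot \log S} \propto \frac{S}{\log S}. \tag{33}$$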



Putting together Eqs. (32) and (33) we find that the slope
of $R_{GE}$ at the transition region grows as log S for large
values of S. An indication of this phenomenon can be
seen in Fig. 8(a) which shows sharper transitions at large
values of S. In the limit of large S, $R_{GE}$ becomes a step
function with respect to N. This fact introduces a critical
value $N_c(S)$ such that for $N < N_c(S)$ the probability of
finding a missing state is zero and for $N > N_c(S)$ the
probability is one. The value of this critical N grows with
the alphabet size as $N_c(S) \propto S/\log S$. Note that $N_c(S)$
is an emergent length scale, as it is not present in any
of the CA rules, but according to the above analysis will
emerge (with probability one) in their dynamics. A direct
consequence of the emergence of $N_c$ is that a measure one
of all CA can be coarse-grained to the elementary rule
"0" on the coarse-grained scale $N_c$.

FIG. 8: (a) The fraction $R_{GE}$ of CA whose N'th supercell version has at least one missing state. Different symbols correspond
to different alphabet sizes S of the original CA. (b) Data collapse of the curves $R_{GE}(N, S)$ from (a) when plotted against the
scaling variable $\xi = K(N, S) \cdot C^S$. The solid line shows that the scaling function can be fitted by Eq. (30).

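
The route from the scaling variable to the $S/\log S$ growth of the critical scale can be written out explicitly. The following is a sketch that assumes the simplified bound $K = 1/(N S^{3(N-1)})$ obtained from Eqs. (26)-(28) with the additive constant neglected, natural logarithms, and $C > 1$ so that the $S \log C$ term dominates at large S.

```latex
\begin{align*}
\xi &= K(N,S)\, C^{S} = \frac{C^{S}}{N\, S^{3(N-1)}}, \\
\log \xi_h &= S \log C - \log N(\xi_h) - 3\,\bigl(N(\xi_h) - 1\bigr) \log S, \\
N(\xi_h) &= \frac{3 \log S + S \log C - \log \xi_h - \log N(\xi_h)}{3 \log S}
  \;\xrightarrow{\; S \to \infty \;}\; \frac{\log C}{3}\, \frac{S}{\log S}, \\
\frac{\partial \xi}{\partial N} &= -\xi \left( \frac{1}{N} + 3 \log S \right)
  \;\Longrightarrow\;
  \left. \frac{\partial R_{GE}}{\partial N} \right|_{\xi_h}
  = -F'(\xi_h)\, \xi_h \left( \frac{1}{N(\xi_h)} + 3 \log S \right).
\end{align*}
```

Because $F'(\xi_h)\,\xi_h$ does not depend on S, the slope is controlled by the $3 \log S$ term once $1/N(\xi_h)$ becomes negligible, which is the logarithmic sharpening described above.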
B. Projection probability of CA rules with 
bounded Kolmogorov complexity 

Generalized "Garden of Eden" states are a specific
form of emergent pattern that can be encountered in the
large scale dynamics of CA. Is the Kolmogorov complexity
of CA rules related to other types of coarse-grained
behavior? To explore this question we attempted to
project (solve the projection equation for) random CA with
bounded Kolmogorov complexities.

To generate a random CA $A = (a(t), S, f_A)$ with a
bounded Kolmogorov complexity we view the update rule
$f_A$ as a string of $S^3 \log_2 S$ bits, denote the i'th bit by
$(f_A)_i$ and apply the following procedure: 1) Randomly
pick the first l bits of $f_A$. 2) Randomly pick a generating
function $G : \{0, 1\}^l \to \{0, 1\}$. 3) Set the values of all the
empty bits of $f_A$ by applying G:

$$(f_A)_i = G\left[(f_A)_{i-l}, (f_A)_{i-l+1}, \ldots, (f_A)_{i-1}\right], \tag{34}$$

starting at i = l + 1 and finishing at $i = S^3 \log_2 S$. Up
to an additive constant, the length of such a procedure is
equal to $l + 2^l$, the number of random bits chosen. The
Kolmogorov complexity of the resulting rule table can
therefore be bounded by

$$K(f_A) \le \frac{l + 2^l}{S^3 \log_2 S}. \tag{35}$$

For small values of l this is a reasonable upper bound.
However for large values of l this upper bound is obviously
not tight since the size of G can be much larger
than the length of $f_A$.
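
The three-step procedure can be sketched as follows. The code assumes that S is a power of two (so that each table entry occupies a whole number of bits) and that l ≥ 1; the function names and the use of Python's `random.Random` as the source of random bits are illustrative choices, not part of the original procedure.

```python
import random
from math import log2

def bounded_complexity_rule(S, l, rng):
    """Rule table bits for an S-state nearest-neighbor CA, generated from
    l random seed bits and a random boolean function G of l variables."""
    n_bits = S ** 3 * int(log2(S))
    bits = [rng.getrandbits(1) for _ in range(l)]        # step 1: first l bits
    G = [rng.getrandbits(1) for _ in range(2 ** l)]      # step 2: truth table of G
    while len(bits) < n_bits:                            # step 3: Eq. (34)
        window = bits[-l:]                               # the l preceding bits
        bits.append(G[int("".join(map(str, window)), 2)])
    return bits[:n_bits], G

def complexity_upper_bound(S, l):
    """Eq. (35): l + 2^l random bits over the S^3 log2(S) bits of the table."""
    return (l + 2 ** l) / (S ** 3 * log2(S))

rng = random.Random(0)
bits, G = bounded_complexity_rule(4, 3, rng)
print(len(bits), complexity_upper_bound(4, 3))  # 128 0.0859375
```

Every bit after the first l is a deterministic function of the l bits before it, so the whole table is fixed by the $l + 2^l$ random bits counted in the bound.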

Using the above procedure we studied the probability
of projecting CA with different alphabets and different
upper bound Kolmogorov complexities K. For given values
of S and l we generated 10,000 (200 for the S = 32
case) CA and tried to find a valid projection on the {0, 1}
alphabet. Figure 9(a) shows the fraction $R_{proj}$ of solvable
instances as a function of $\xi = K \cdot C^S$. The constant
C used for this data collapse is 1.02, very close to 1.
As valid projection solutions we considered all possible
projections P : {0 … S − 1} → {0, 1}. In doing so we may be
redoing the missing states experiment because many low
Kolmogorov complexity rules have missing states and can
thus be trivially projected. In order to exclude this option
we repeated the same experiment while restricting
the family of allowed projections to be equal partitions
of {0 … S − 1}, i.e.

$$P : \{0 \ldots S - 1\} \to \{0, 1\}; \qquad |\{x : P(x) = 0\}| = |\{x : P(x) = 1\}|. \tag{36}$$

The results are shown in Fig. 9(b).

It seems that in both cases there is a good correlation
between the Kolmogorov complexity (or its upper bound)
of a CA rule and the probability of finding a valid
projection. In particular, the fraction of solvable instances
goes to one at the low K limit. As shown by the solid
lines in Fig. 9 this fraction can again be fitted by

$$R_{proj} = \frac{1}{1 + (\xi/\xi_0)^{\alpha}} \tag{37}$$

where $\xi_0$ is a constant and in this case $\alpha \approx 1$.

FIG. 9: (a) and (b) show the fraction R_proj of Kolmogorov complexity bounded CA that have a valid projection on the binary 
alphabet. CA were generated using a random generating function with I variables according to the procedure described above. 
K (Eq. (35)) is the resulting upper bound Kolmogorov complexity. Different symbols correspond to different alphabet sizes. 
Insets show the data as a function of the parameter I. (a) shows results in the case where all projections P : {0, ..., S − 1} → {0, 1} are 
allowed; (b) shows results in the case where only equal partition projections (Eq. (36)) are allowed. Solid lines in (a) and (b) 
show a fit by Eq. (37). (c) shows the fraction of complex behaving rules which are produced by our procedure as a function 
of I. 

How many of the CA rules that we generate and 
project show complex behavior? Does the fraction of 
projectable rules simply reflect the fraction of simple 
behaving rules? To answer this question we studied the 
rules generated by our procedure. For each value of S 
and I we generated 100 rules and counted the number of 
rules exhibiting complex behavior. A rule was labelled 
"complex" if it showed class 3 or 4 behavior and exhibited 
a complex sensitivity to perturbations in the initial 
conditions. Fig. 9(c) shows the statistics we obtained 
with different alphabet sizes as a function of K, while 
the inset shows it as a function of I. We first note that 
our statistics support Dubacq et al., who proposed 
that rule tables with low Kolmogorov complexities lead 
to simple behavior and rule tables with large Kolmogorov 
complexity lead to complex behavior. Moreover, our results 
show that the fraction of complex rules does not 
depend on the alphabet size and is only a function of I. 
Rules with larger alphabets show complex behavior at 
a lower value of K. As a consequence, a large fraction 
of projectable rules are complex and this fraction grows 
with the alphabet size S. 

As we explained earlier, the Kolmogorov complexity of 
supercell versions of CA approaches zero as the supercell 
size N is increased. Our experiments therefore indicate 
that a set of measure one of all CA can be coarse-grained if we 
use a coarse enough scale. Moreover, the data collapse 
that we obtain and the sharp transition of the scaling 
function suggest that it may be possible to know in advance 
at what length scales to look for valid projections. 
This can be very useful when attempting to coarse-grain 
CA or other dynamical systems because it can narrow 
down the search domain. As in the case of "Garden of 
Eden" states that we studied earlier, we interpret the 
transition point as an emergence scale above which we 
are likely to find self-organized patterns. Note, however, 
that this scale is slightly shifted in Fig. 9(b) when compared 
with Fig. 9(a). The emergence scale is thus sensitive 
to the types of large scale patterns we are looking 
for. 

VI. SUMMARY AND DISCUSSION 

In this work we studied emergent phenomena in complex 
systems and the associated predictability problems 
by attempting to coarse-grain CA. We found that many 
elementary CA can be coarse-grained in space and time 
and that in some cases complex, undecidable CA can 
be coarse-grained to decidable and predictable CA. We 
conclude from this fact that undecidability and computational 
irreducibility are not good measures for physical 
complexity. Physical complexity, as opposed to computational 
complexity, should address the interesting, physically 
relevant, coarse-grained degrees of freedom. These 
coarse-grained degrees of freedom may be simple and predictable 
even when the microscopic behavior is very complex. 

The above definition of physical complexity raises 
the question of the objectivity of macroscopic descriptions. 
Is our choice of a coarse-grained description 
(and its consequent complexity) subjective or is 
it dictated by the system? Our results are in accordance 
with Shalizi and Moore: it is both. In many cases 
we discovered that a particular CA can undergo different 
coarse-graining transitions using different projection operators. 
In these cases the system dictates a set of valid 
projection operators and we are restricted to choosing our 
coarse-grained description from this set. We do, however, 
have some freedom to manifest our subjective interest. 

The coarse-graining transitions that we found induce 
a hierarchy on the family of elementary CA (see Fig. 

Moreover, it seems that rule complexity never increases 
with coarse-graining transitions. The coarse-graining 
hierarchy therefore provides a partial complexity 
order of CA where complex rules are found at the 
top of the hierarchy and simple rules are at the bottom. 
The order is partial because we cannot relate rules which 
are not connected by coarse-graining transitions. This 
coarse-graining hierarchy can be used as a new classification 
scheme of CA. Unlike Wolfram's classification, this 
scheme is not a topological one since the basis of our 
suggested classification is not the CA trajectories. Nor is 
this scheme parametric, such as Langton's λ parameter 
scheme. Our scheme reflects similarities in the algebraic 
properties of CA rules. It simply says that if some coarse-grained 
aspects of rule A can be captured by the detailed 
dynamics of rule B then rule A is at least as complex as 
rule B. Rule A may be more complex because in some 
cases it can do more than its projection. Note that our 
hierarchy may subdivide Wolfram's classes. For example, 
rule 128 is higher on the hierarchy than rule 0. These two 
rules belong to class 1 but rule 128 can be coarse-grained 
to rule 0 and it is clear that an opposite transition cannot 
exist. It will be interesting to find out if classes 3 and 4 
can also be subdivided. 
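
The rule 128 example can be checked numerically. The sketch below assumes the coarse-graining condition P(f_{A^N}^T(x1, x2, x3)) = f_B(P(x1), P(x2), P(x3)) for all triples of N-cell blocks, with Wolfram's rule numbering. The choice N = T = 3 and the projection that maps only the block 101 to 1 are illustrative assumptions, not necessarily the projection behind the transition quoted above. The helper names are hypothetical.

```python
from itertools import product


def step(cells, rule):
    """One elementary-CA update on a finite window (one cell is lost per side)."""
    return tuple((rule >> (4 * a + 2 * b + c)) & 1
                 for a, b, c in zip(cells, cells[1:], cells[2:]))


def supercell_update(blocks, rule, N, T):
    """Central N-cell block after T steps, given three adjacent N-cell blocks."""
    if T > N:
        raise ValueError("this sketch requires T <= N")
    cells = blocks[0] + blocks[1] + blocks[2]
    for _ in range(T):
        cells = step(cells, rule)
    return cells[N - T:2 * N - T]


def coarse_grains_to(rule_a, rule_b, N, T, P):
    """Check the commutation condition for every triple of N-cell blocks."""
    blocks = list(product((0, 1), repeat=N))
    for triple in product(blocks, repeat=3):
        lhs = P(supercell_update(triple, rule_a, N, T))
        a, b, c = (P(block) for block in triple)
        rhs = (rule_b >> (4 * a + 2 * b + c)) & 1
        if lhs != rhs:
            return False
    return True


def only_101(block):
    """Assumed projection: 1 for the block 101, 0 for every other block."""
    return int(block == (1, 0, 1))


forward = coarse_grains_to(128, 0, 3, 3, only_101)
reverse = coarse_grains_to(0, 128, 3, 3, only_101)
print(forward, reverse)
```

The forward check succeeds because the block 101 cannot appear after a single step of rule 128, so the projected dynamics is identically zero. The reverse check fails for this particular projection only; it is a consistency check, not a proof that no reverse transition exists.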

In the last part of this work we tried to understand why 
it is possible to find so many coarse-graining transitions 
between CA. At first blush, it seems that coarse-graining 
transitions should be rare because finding valid projection 
operators is an over-constrained problem. This was 
our initial intuition when we first attempted to coarse-grain 
CA. To our surprise we found that many CA can 
undergo coarse-graining transitions. 

A more careful investigation of the above question sug- 
gests that finding valid projection operators is possible 
because of the structure of the rules which govern the 
large scale dynamics. These large scale rules are update 
functions for supercells, whose tables can be computed 
directly from the single cell update function. They thus 
contain the same amount of information as the single cell 
rule. Their size however grows with the supercell size and 
therefore they have vanishing Kolmogorov Complexities. 
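
The growth of the supercell table relative to the information needed to generate it can be illustrated directly. The sketch below builds the table of the supercell update function from the 8-bit elementary rule, assuming T = N time steps per supercell update and Wolfram's rule numbering. `supercell_table` is a hypothetical helper, and the printed ratio (8 bits over the table size in bits) is only a crude stand-in for a normalized Kolmogorov complexity bound, which also carries an additive constant.

```python
from itertools import product


def step(cells, rule):
    """One elementary-CA update on a finite window (one cell is lost per side)."""
    return tuple((rule >> (4 * a + 2 * b + c)) & 1
                 for a, b, c in zip(cells, cells[1:], cells[2:]))


def supercell_table(rule, N, T):
    """Table mapping three adjacent N-cell blocks to the central block after T steps."""
    if T > N:
        raise ValueError("this sketch requires T <= N")
    blocks = list(product((0, 1), repeat=N))
    table = {}
    for left, center, right in product(blocks, repeat=3):
        cells = left + center + right
        for _ in range(T):
            cells = step(cells, rule)
        table[(left, center, right)] = cells[N - T:2 * N - T]
    return table


RULE_BITS = 8  # an elementary rule is fully specified by 8 bits
for N in (1, 2, 3):
    table = supercell_table(110, N, N)
    table_bits = len(table) * N  # 2^(3N) entries of N bits each
    print(N, len(table), table_bits, RULE_BITS / table_bits)
```

The table has 2^(3N) entries of N bits each, while the description that generates it stays at 8 bits plus the values of N and T, so the ratio shrinks rapidly with N.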

In other words, the large scale update functions are 
highly structured objects. They contain many regularities 
which can be used for finding valid projection operators. 
We did not give a formal proof for this statement 
but provided strong experimental evidence. In our experiments 
we discovered that the probability of finding a 
valid projection is a universal function of the Kolmogorov 
complexity of the supercell update rule. This universal 
probability function varies from zero at large Kolmogorov 
complexity (small supercells) to one at small 
Kolmogorov complexity (large supercells). It is therefore 
very likely that we find many coarse-graining transitions 
when we go to large enough scales. 

Our interpretation of the above results is that of emergence. 
When we go to large enough scales we are likely to 
find dynamically identifiable large scale patterns. These 
patterns are emergent (or self-organized) because they 
do not explicitly exist in the original single cell rules. 
The large scale patterns are forced upon the system by 
the lack of information. Namely, the system (the update 
rule, not the cell lattice) does not contain enough 
information to be complex at large scales. 

Finding a projection operator is one specific type of 
over-constrained problem. Motivated by our results 
we looked into other types of over-constrained problems. 
The satisfiability problem (k-sat) [50] is a generalized 
(NP-complete) form of constraint satisfaction system. 
We generated random 3-sat instances with different 
numbers of variables deep in the un-sat region of parameter 
space. The generated instances, however, were 
not completely random and were produced by generating 
functions. The generating functions controlled the 
instance's Kolmogorov complexity, in the same way that 
we used in Section V B. We found that the probability 
for these instances to be satisfiable obeys the same 
universal probability function of Eq. (37). It will be interesting 
to understand the origin of this universality and 
its implications. 

In this work, we have restricted ourselves to 
CA because it is relatively easy to look for valid projection 
operators for them. A greater (and more practical) 
challenge will now be to try to coarse-grain more sophisticated 
dynamical systems such as probabilistic CA, 
coupled maps and partial differential equations. These 
types of systems are among the main workhorses of scientific 
modelling, and being able to coarse-grain them 
will be very useful; this is a topic of current research, e.g. 
in materials science. It will be interesting to see if one 
can derive an emergence length scale for those systems 
like the one we found for "Garden of Eden" sequences in 
CA (Section V A). Such an emergence length scale can 
assist in finding valid projection operators by narrowing 
the search to a particular scale. 